Polycomb protein binding and looping in the ON transcriptional state

Polycomb group (PcG) proteins mediate epigenetic silencing of important developmental genes by modifying histones and compacting chromatin through two major protein complexes, PRC1 and PRC2. These complexes are recruited to DNA by CpG islands (CGIs) in mammals and Polycomb response elements (PREs) in Drosophila. When PcG target genes are turned OFF, PcG proteins bind to PREs or CGIs, and PREs serve as anchors that loop together and stabilize gene silencing. Here, we address which PcG proteins bind to PREs and whether PREs mediate looping when their targets are in the ON transcriptional state. While the binding of most PcG proteins decreases at PREs in the ON state, one PRC1 component, Ph, remains bound. Further, PREs can loop to each other and with presumptive enhancers in the ON state and, like CGIs, may act as tethering elements between promoters and enhancers. Overall, our data suggest that PREs are important looping elements for developmental loci in both the ON and OFF states.


ChIP-seq
Cell fixation: For each cell type, 30 mL of cells at 6 × 10^6 cells/mL were used. 1.87 mL of 16% formaldehyde was added, and the cells were incubated at RT for 10 min with gentle rocking. The formaldehyde was quenched by adding glycine to a final concentration of 0.125 M, followed by a 5 min incubation at RT with gentle rocking. The cells were spun down at 2000 g for 3 min and washed twice with ice-cold PBS. The cells were resuspended in 600 µL PBS and divided into 100 µL aliquots; these were spun down, and the pellets were flash frozen and stored at -80 ℃. Each aliquot of cells can be used for up to 5 ChIP reactions. Cells were fixed and frozen from different flasks to generate independent biological replicates.
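The fixation volumes above imply a final formaldehyde concentration and a per-aliquot cell number that can be checked with simple dilution arithmetic. The following is a minimal sketch, not part of the published protocol. It assumes the cell density reads 6 × 10^6 cells/mL, that no cells are lost during washes, and a hypothetical 2.5 M glycine stock (the protocol states only the final concentration). The function names are illustrative.

```python
def final_concentration(stock_conc, stock_vol, sample_vol):
    """Concentration after adding stock_vol of stock to sample_vol (same volume units)."""
    return stock_conc * stock_vol / (sample_vol + stock_vol)


def stock_volume_for_target(stock_conc, target_conc, sample_vol):
    """Volume of stock to add to sample_vol so the mixture reaches target_conc."""
    if not 0 < target_conc < stock_conc:
        raise ValueError("target must be positive and below the stock concentration")
    return target_conc * sample_vol / (stock_conc - target_conc)


culture_ml = 30.0

# 1.87 mL of 16% formaldehyde added to 30 mL of culture.
fa_percent = final_concentration(16.0, 1.87, culture_ml)

# ASSUMPTION: 2.5 M glycine stock; only the 0.125 M final value is in the protocol.
glycine_ml = stock_volume_for_target(2.5, 0.125, culture_ml + 1.87)

# ASSUMPTION: 6 x 10^6 cells/mL and no cell loss during the washes.
cells_total = culture_ml * 6e6
cells_per_aliquot = cells_total / 6      # 600 µL split into six 100 µL aliquots
cells_per_chip = cells_per_aliquot / 5   # up to 5 ChIP reactions per aliquot

print(f"final formaldehyde: {fa_percent:.2f}%")
print(f"glycine stock to add: {glycine_ml:.2f} mL")
print(f"cells per aliquot: {cells_per_aliquot:.1e}, per ChIP: {cells_per_chip:.1e}")
```

Under these assumptions the formaldehyde ends up slightly below 1%, and each ChIP reaction corresponds to at most about 6 × 10^6 cells.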

X-ChIP:
The frozen formaldehyde-fixed cell pellets were resuspended in 0.8 mL ice-cold cell lysis buffer (5 mM PIPES pH 8, 85 mM KCl, 0.5% NP40, supplemented with protease inhibitors (Roche Complete EDTA-free protease inhibitor)), incubated on ice for 10 min, then pelleted by centrifugation at 2000 g for 5 min at 4 ℃. The supernatant was removed, and the pellet was resuspended in 1 mL nuclear lysis buffer (50 mM Tris-HCl pH 8, 10 mM EDTA, 0.2% SDS, supplemented with protease inhibitors) and incubated for 10 min at 4 ℃ on a rocking platform. 0.5 mL of 400 mM NaCl IP dilution buffer (16.7 mM Tris-HCl pH 8, 1.2 mM EDTA, 400 mM NaCl, 1.1% Triton X100, 0.01% SDS, supplemented with protease inhibitors) was added, and the lysate was gently mixed. The lysate was sonicated in 300 µL aliquots using a Qsonica sonicator (Model Q800R3) at 40% amplitude, with cycles of 30 sec ON and 30 sec OFF, for a total of 5 min ON time. The sonicated samples were spun for 10 min at full speed in an Eppendorf centrifuge at 4 ℃. The 300 µL aliquots of each cell type sample were then pooled. 4 µL was removed as input (2% of a single ChIP reaction). To the remainder, 100 µL of TE-washed Protein A Sepharose Fast Flow (Cytiva) was added, and the samples were incubated at 4 ℃ for 1 hour with gentle rocking. The samples were spun at full speed in an Eppendorf centrifuge for 1 min, and the supernatant was transferred to a fresh tube. For each IP, 200 µL of the sonicated sample was combined with 800 µL of 67 mM NaCl ChIP dilution buffer (16.7 mM Tris-HCl pH 8, 1.2 mM EDTA, 67 mM NaCl, 1.1% Triton X100, 0.01% SDS, supplemented with protease inhibitors). The appropriate amount of antibody (Supplemental Table 1) was added, and the samples were incubated with rocking overnight at 4 ℃. 60 µL of protein A/agarose bead slurry (prewashed with TE buffer prior to use) was then added for 1 hour at 4 ℃ with rotation. The agarose was pelleted by centrifugation at 800 rpm at 4 ℃ for 1 min, the supernatant was carefully removed, and the agarose was washed for 5 min on a rotating platform at 4 ℃ sequentially with 1 mL of each of the following buffers: low salt immune complex buffer (0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris-HCl pH 8.0, 150 mM NaCl), high salt immune complex wash (0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris-HCl pH 8.0, 200 mM NaCl), and LiCl immune complex wash (0.25 M LiCl, 1% NP40, 1% deoxycholic acid (sodium salt), 1 mM EDTA, 10 mM Tris pH 8.0), followed by two 5 min washes with TE buffer. The agarose was pelleted again at 800 rpm for 1 min, and the DNA was eluted with 500 µL freshly made elution buffer (1% SDS, 0.1 M NaHCO3) for 30 min with gentle rocking. The agarose was pelleted, and the supernatant was transferred to a fresh tube. The crosslinks were reversed by adding 20 µL of 5 M NaCl and incubating at 65 ℃ for 4 hours. The crosslinks were also reversed in the input sample. 10 µL of 0.5 M EDTA, 20 µL of 1 M Tris-HCl pH 6.5 and 2 µL of 10 mg/mL proteinase K were added, and the samples were incubated at 45 ℃ for 1 hour. The DNA was recovered by phenol/chloroform extraction followed by ethanol precipitation with 2 µL of pellet paint, 50 µL 5 M NaOAc and 1 mL ethanol. Washed and dried pellets were resuspended in PCR grade water.
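The buffer volumes in the X-ChIP protocol determine the salt and detergent concentrations during sonication and immunoprecipitation, which can be estimated as volume-weighted averages. The sketch below is illustrative only. It assumes the nuclear pellet contributes negligible volume and that buffers listed without NaCl contain none. The helper `mix` is a hypothetical name, not part of any library.

```python
def mix(components):
    """Return (total_volume, concentration) for (volume, concentration) pairs."""
    total = sum(vol for vol, _ in components)
    if total <= 0:
        raise ValueError("total volume must be positive")
    conc = sum(vol * c for vol, c in components) / total
    return total, conc


# Volumes in µL; NaCl in mM; SDS in percent.
# Lysate: 1 mL nuclear lysis buffer (no NaCl, 0.2% SDS)
# plus 0.5 mL of 400 mM NaCl IP dilution buffer (0.01% SDS).
lysate_vol, lysate_nacl = mix([(1000, 0.0), (500, 400.0)])
_, lysate_sds = mix([(1000, 0.2), (500, 0.01)])

# IP: 200 µL lysate plus 800 µL of 67 mM NaCl ChIP dilution buffer (0.01% SDS).
ip_vol, ip_nacl = mix([(200, lysate_nacl), (800, 67.0)])
_, ip_sds = mix([(200, lysate_sds), (800, 0.01)])

# Input: 4 µL relative to the 200 µL of lysate used per IP.
input_fraction = 4 / 200

# Crosslink reversal: 20 µL of 5 M NaCl added to 500 µL eluate (NaCl in M).
_, reversal_nacl = mix([(500, 0.0), (20, 5.0)])

print(f"lysate: {lysate_nacl:.0f} mM NaCl, {lysate_sds:.3f}% SDS")
print(f"IP: {ip_nacl:.0f} mM NaCl, {ip_sds:.3f}% SDS in {ip_vol:.0f} µL")
print(f"input fraction: {input_fraction:.0%}; reversal NaCl: {reversal_nacl:.2f} M")
```

Under these assumptions the sonication lysate is at roughly 133 mM NaCl, the IP at roughly 80 mM NaCl, and the 4 µL input matches the stated 2% of one ChIP reaction.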
ChIP-seq library construction: 1.5 ng of ChIP sample DNA (as measured by a Qubit 3.0 fluorimeter) was used to prepare each library. Libraries were made using either the ThruPLEX DNA-seq and dual index kits (Takara) or the NEBNext Ultra II DNA library preparation kit and dual index kits (New England Biolabs), according to the manufacturers' directions. Samples were sequenced by 50 bp paired-end sequencing on a NovaSeq 6000 with an SP100 kit by the NICHD Molecular Genomics Core.

RNA-seq
Library preparation and sequencing: For each cell line, RNA was prepared from 4 separate tissue culture flasks to generate four independent replicates. For each RNA prep, 1 × 10^7 cells were spun down (1000 g, 3 min), washed twice with ice-cold PBS, resuspended in 1 mL of ice-cold PBS, transferred to a 1.5 mL Eppendorf tube, and spun down at full speed in a microfuge for 2 min at 4 ℃. The pellet was resuspended in 100 µL PBS. 1 mL of Trizol was added to the cell suspension, which was vortexed for 1 min and incubated at RT for 15 min. 200 µL of chloroform was added, and the samples were vortexed for 2 min and centrifuged at 13,200 rpm for 15 min at 4 ℃. The top aqueous layer containing the RNA was carefully transferred to a fresh tube. The RNA concentration was measured using the broad range Qubit RNA kit and a Qubit 3.0 fluorimeter (Invitrogen). An aliquot of the total RNA was further purified using the Qiagen RNeasy Micro kit following the manufacturer's directions. Libraries were made using the Illumina TruSeq Stranded mRNA sample prep kit and then run on a NovaSeq 6000 using an SP 200 kit by the NICHD Molecular Genomics Core.

Micro-C
First, 25 million cells were pelleted and then fixed with DSG followed by formaldehyde (FA). 50 mg of DSG was resuspended in 500 µL DMSO and diluted with 50 mL PBS. The cell pellets were resuspended in the DSG solution at a concentration of 1 × 10^6 cells/mL. Cells were incubated with gentle rocking for 35 min at room temperature (RT). 16% FA was added dropwise to a final concentration of 1%, and the cells were incubated for 10 min with rocking at RT. Glycine was added to a final concentration of 0.13 M, and the cells were incubated for 5 min at RT and 5 min on ice. Cells were pelleted at 1000 g for 5 min, washed with ice-cold PBS at a concentration of 1 × 10^6 cells/mL, centrifuged at 2500 g for 5 min at 4 ℃, then washed with ice-cold PBS at a concentration of 1 × 10^6 cells/100 µL. Cells were counted, then divided into aliquots of 1 × 10^6 or 5 × 10^6 cells in protein low-bind tubes. Cells were centrifuged at 2500 g for 5 min at 4 ℃, and the cell pellets were flash frozen and stored at -80 ℃. MNase was titrated with 1 × 10^6 cells at 3 U, 5 U and 7 U for 20 min at 37 ℃. The DNA purification step was carried out using a Zymoclean DNA clean and concentrator kit. All DNA concentrations were measured with a Qubit, and fragment sizes were assessed using a high sensitivity Tapestation (Agilent). The micrococcal nuclease reaction was carried out with 5 × 10^6 cells and scaled up to 500 µL with the optimal concentration of micrococcal nuclease; incubation was at 37 ℃ for 20 min. Centrifugation steps were at 3000 g. BSA was added to 100 µg/mL to the MB#2 and MB#3 solutions just before use to help pellet disruption. DNA fragment end repair was carried out in 95 µL end chewing master mix + 5 µL 10 U/µL PNK. 50 µL of end labeling mix was added per sample. Centrifugations were at 5000 g. For proximity end ligation, pellets were resuspended in 500 µL of ligation master mix. After phenol/chloroform/isoamyl alcohol extraction, the sample was split into two equal aliquots and purified on two Zymo DNA clean and concentrator kit columns. Samples were eluted with 25 µL elution buffer (preheated to 70 ℃), then pooled. Samples were loaded onto a 3% TBE NuSieve GTG agarose gel in 4 separate wells. After excision from each lane, the DNA was extracted using a Zymo Gel DNA Recovery kit. Elution was with 75 µL elution buffer preheated to 70 ℃. The 4 samples were then pooled to give a 300 µL sample. Enrichment of dinucleosomal DNA was confirmed by running an aliquot on an Agilent high sensitivity Tapestation. For biotin purification, 50 µL of streptavidin beads were washed twice with 400 µL TBW, resuspended in 300 µL 2× BW, then added to the 300 µL Micro-C sample, followed by a 50 min incubation with rotation at RT. Beads were washed twice with 600 µL TBW at 55 ℃ in a Thermomixer for 2 min. The beads were washed once with 100 µL 10 mM Tris, then resuspended in 50 µL 10 mM Tris. Libraries were prepared using the KAPA Biosystems HyperPrep kit and Illumina primers. End repair and A-tailing were carried out as recommended by the manufacturer. Adapter ligation was carried out with 1 µL annealed primer at 15 µM for 60 min at RT, with gentle mixing of the beads every 10 min. 300 µL TWB was added, the sample was vortexed briefly and placed on a magnet, and the supernatant was removed. The beads were washed as above with 600 µL TWB, followed by 100 µL 10 mM Tris-HCl (pH 8.0), then resuspended in 84 µL 10 mM Tris-HCl (pH 8.0). After running a small-scale PCR to determine the optimal number of cycles for the required yield, 4 × 20 µL PCR reactions were set up for each sample. PCR: 98 ℃ 120 seconds, (98 ℃ 30 seconds, 65 ℃ 20 seconds, 72 ℃ 15 seconds) × 12 cycles, 72 ℃ 3 min. PCR reactions were pooled, and the beads were removed on a magnetic separator. 200 µL was transferred to a new tube and incubated with 0.9x SPRI beads. The final elution step was with 25 µL of 10 mM Tris-HCl (pH 8.0). Samples were sequenced by 50 bp paired-end sequencing on a NovaSeq 6000 with an SP100 kit by the NICHD Molecular Genomics Core.
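Several quantities in the Micro-C protocol follow from simple arithmetic on the stated amounts. The sketch below is illustrative only. It assumes a DSG molecular weight of 326.26 g/mol (a value not given in the protocol), reads the cell densities as 1 × 10^6 cells/mL, reads the cycling program as 12 cycles, and ignores thermocycler ramp times. All variable and function names are hypothetical.

```python
# ASSUMPTION: molecular weight of disuccinimidyl glutarate (DSG), g/mol.
DSG_MW = 326.26

# 50 mg DSG dissolved in 0.5 mL DMSO, then diluted with 50 mL PBS.
dsg_mmol = 50.0 / DSG_MW                   # mg / (g/mol) = mmol
dsg_mM = dsg_mmol / (0.5 + 50.0) * 1000.0  # mmol/mL equals M; x1000 gives mM

# 25 million cells fixed at 1 x 10^6 cells/mL.
fix_volume_ml = 25e6 / 1e6

# Volume of 16% FA that brings fix_volume_ml to 1% final: C1*V1 = C2*(V1 + V2).
fa_ml = 1.0 * fix_volume_ml / (16.0 - 1.0)


def pcr_hold_seconds(initial, cycle_steps, cycles, final):
    """Total programmed hold time of a PCR run, excluding ramping."""
    return initial + cycles * sum(cycle_steps) + final


# 98 C 120 s; (98 C 30 s, 65 C 20 s, 72 C 15 s) x 12 cycles; 72 C 3 min.
pcr_seconds = pcr_hold_seconds(120, (30, 20, 15), 12, 180)

# 0.9x SPRI cleanup of a 200 µL sample.
spri_ul = 0.9 * 200

print(f"DSG: {dsg_mM:.2f} mM; fixation volume: {fix_volume_ml:.0f} mL")
print(f"16% FA to add: {fa_ml:.2f} mL")
print(f"PCR hold time: {pcr_seconds / 60:.0f} min; SPRI beads: {spri_ul:.0f} µL")
```

Under these assumptions the DSG working solution is close to 3 mM, and the amplification program holds for 18 min before ramp times are counted.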

Fig. S2. Transcriptional pattern and the binding of various PcG proteins over the inv-en locus in S2 and D17 cells
The IGV tracks show the RNA-seq and normalized ChIP-seq data of various PcG proteins over the inv-en locus and the neighboring genes E(Pc) and tou. E(z), Pcl, C-trx and input are scaled at 0-7, and all other tracks are scaled at 0-10.

Fig. S3. Transcriptional pattern and the binding of various PcG proteins over the abd-A/abd-B region in S2 and D17 cells
The IGV tracks show the RNA-seq and normalized ChIP-seq data of various PcG proteins over the abd-A and abd-B region in S2 and D17 cells. E(z), Pcl, C-trx and input are scaled at 0-7, and all other tracks are scaled at 0-10.

Fig. S6. PcG binding and chromatin structure over the inv-en domain in S2 and D17 cells
This figure is related to Figure 3 but shows a larger view. The contact map at the top shows the differences between S2 and D17 cells, while the other two show each cell type individually. The arrow indicates that the en transcription unit forms its own small domain in S2 cells, whereas it is in the same domain as inv in D17 cells. The significant loops are visualized as arcs just below the gene models, and the matching loop dots are highlighted by green circles in the contact maps. At the bottom, the signal intensities for H3K27me3, H3K27ac, Pho, GAF and Ph are shown, with the D17-specific GAF binding sites highlighted by orange rectangles. The positions of the inv-en PREs are indicated by dashed lines and labelled at the bottom.

Fig. S1. Distribution of H3K4me2/3 and H3K36me2/3 over the inv-en locus in S2 and D17 cells
(A) IGV tracks show the distribution of H3K4me2, H3K4me3, H3K36me2 and H3K36me3 over the inv-en locus and the neighboring genes E(Pc) and tou. A Ph track from larval tissue is included to indicate the positions of the PREs. All tracks are scaled at 0-7. (B) Similar to A, but shows an enlarged region for the en and inv PREs and transcription units.

Fig. S4. Binding of core PcG and related proteins at abd-b PREs in S2 and D17 cells
This figure is related to Fig. 2, except that it shows the abd-b locus. (A,B) IGV tracks show the occupancy of different PcG complexes (PRC1, PRC2, Pho-RC) and DNA-binding factors, together with chromatin accessibility, at the two abd-b PREs as control. (B) Bar plots show the differences in ChIP signals for different PcG proteins and related factors at each abd-b PRE between D17 and S2 cells.

Fig. S5. ATAC-seq and ChIP-seq over inv and en in S2 and D17 cells
The red boxes highlight new binding peaks of GAF, Spps, and Ph and the corresponding increased accessibility of chromatin in the ATAC-seq samples. The new binding peaks within the inv gene are enlarged below. Three areas are highlighted by black lines and double-headed arrows to indicate fragments that were tested for enhancer activity in a previous study (54). The two fragments marked by asterisks showed enhancer activity; both fragments contain one of the new GAF peaks and show increased chromatin accessibility. ATAC-seq is scaled at 0-5, GAF and Spps at 0-7, and Ph at 0-10.

Fig. S7. Differential gene expression between S2 and D17 cells
(A) Heatmap shows the clustering of different samples based on their transcriptomes. The color gradient indicates Pearson's r. (B) PCA plot shows the relationship of different samples based on their transcriptomes. The top 500 genes with the highest expression variation across samples were used. (C) MA plot shows the result of the differential expression analysis between the two cell types. (D) Heatmap shows the expression of the identified DEGs between S2 and D17 cells. The color gradient indicates the row z-score.
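The quantities named in the Fig. S7 legend (Pearson's r between samples, selection of the most variable genes, MA coordinates, and row z-scores) can be illustrated with a short sketch. This is not the authors' analysis pipeline, which is not described here. The expression values and gene names below are invented toy data, and all function names are hypothetical.

```python
from math import log2, sqrt
from statistics import mean, stdev, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length value lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def top_variable_genes(expr, n_top=500):
    """Names of the n_top genes with the highest variance across samples."""
    ranked = sorted(expr, key=lambda g: variance(expr[g]), reverse=True)
    return ranked[:n_top]


def row_zscores(values):
    """Center and scale one gene's values across samples (row z-score)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0] * len(values)
    return [(v - m) / s for v in values]


def ma_point(mean_a, mean_b):
    """MA coordinates: M is the log2 ratio, A the average log2 level."""
    m = log2(mean_b) - log2(mean_a)
    a = 0.5 * (log2(mean_a) + log2(mean_b))
    return m, a


# TOY DATA (invented): gene -> values in four samples (two per cell type).
expr = {
    "geneA": [1.0, 1.2, 8.0, 8.4],
    "geneB": [5.0, 5.1, 5.0, 4.9],
    "geneC": [6.0, 6.5, 3.0, 2.5],
}

selected = top_variable_genes(expr, n_top=2)
columns = list(zip(*(expr[g] for g in selected)))  # one tuple per sample
r_replicates = pearson_r(columns[0], columns[1])
z_geneA = row_zscores(expr["geneA"])
m_val, a_val = ma_point(4.0, 16.0)

print(selected, round(r_replicates, 3), [round(z, 2) for z in z_geneA])
print(m_val, a_val)
```

In practice these steps are usually run on normalized counts with dedicated packages; the sketch only makes the definitions behind the legend explicit.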

Fig. S8. Epigenetic and transcriptomic alterations of the subdomains at the bab1/bab2 locus between S2 and D17 cells
The IGV tracks show the RNA-seq data together with ChIP-seq data for H3K27me3, H3K27ac, Ph, Psc, Pc and E(z) at the bab1/2 locus in S2 and D17 cells. The two subdomains that overlap bab2 genes are labelled, and their DB analysis result is provided in the table below.

Fig. S9. Transcription, PcG protein binding and chromatin structure over the zfh2 domain in S2 and D17 cells
The top panel shows RNA-seq, ATAC-seq and ChIP-seq data for the zfh2 domain in S2 and D17 cells. RNA-seq is scaled at 0-8; ATAC-seq, H3K27me3 and H3K27ac at 0-5; Spps and GAF at 0-10; and all other tracks at 0-7. The lower panel shows Micro-C data over the same region in S2 and D17 cells at 200 bp resolution.